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1 Introduction 

This report summarizes the work that has been done on the project from April 1, 1992 to March 
31, 1993. The main goal of this research is to develop a practical tool for rotorcraft control system 
design based on interactive optimization tools (CONSOL-OPTCAD) as well as classical rotorcraft 
design considerations (ADOCS). This approach enables the designer to combine engineering in- 
tuition and experience with parametric optimization. The combination should make it possible 
to produce better design faster than would be possible using either pure optimization or pure 
intuition and experience. 

We emphasize that the goal of this project is not to develop an algorithm. It is to develop a 
tool. We want to keep the human designer in the design process so as to be able to take advantage 
of his or her experience and creativity. The role of the computer is to perform the calculation 
necessary to improve, and to display the performance of the nominal design. 

Briefly, during the first year we have connected CONSOL-OPTCAD, an existing software 
package for optimizing parameters with respect to multiple performance criteria, to a simplified 
nonlinear simulation of the UH-60 rotorcraft. We have also created mathematical approximations 
to the Mil-specs for rotorcraft handling qualities and input them into CONSOL-OPTCAD. Finally, 
we have developed the additional software necessary to use CONSOL-OPTCAD for the design of 
rotorcraft controllers. 

In order to meet the specification we do not actually have to solve an optimization problem 
(i.e., we are not looking for a global minimum for the cost functions). In fact, we are just looking 
for a feasible solution, i.e., a solution which satisfies all the mathematical constraints (handling 
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qualities requirements). Using the optimization package CONSOL-OPTCAD we actually try to 
solve this problem by min/max optimization techniques [1]. 

If the problem, has a feasible solution, and the performance criteria are convex and smooth 
(with respect to the design parameters), a solution can be found iteratively by minimizing the 
maximum of the normalized (weighted) performances (min/max). However, in complicated prob- 
lems, such as design of rotorcraft flight controls, the system performance criteria are usually not 
convex and not smooth. Moreover, it is not clear whether there exists a feasible solution or not. 
For these cases we can still use the min/max techniques to obtain the best possible solution (i.e., 
most of the requirements are satisfied and the rest are “almost” satisfied). In order to solve this 
problem we first have to transfer the system performance criteria and constraints into smooth 
functions (see Paragraph 2.2). Then we have to choose two sets of parameters. One is the initial 
guess for the design parameters. The second is the performances weighting factors. Note that 
right choice of these two parameter sets may increase the rate of convergence, where on the other 
hand, a poor choice may cause the algorithm to diverge. 

Apparently, this is an infinite choice tradeoff problem. However we can use a-priori knowledge 
and engineering intuition to choose the optimization parameters wisely (e.g., use ADOCS param- 
eters as the initial guess). Moreover, using CONSOL-OPTCAD allows us to change our choice at 
each iteration during the design process, so using this feature one can develop a tradeoff strategy 
and even an optimal design “intuition”. This idea, of using mixed optimal/classical techniques 
has already been used in the LQG/LTR [2], and in the symmetric root-locus [3] methods. In both 
methods the optimization step ensures a certain performance level (at least stability), while the 
other step allows the designer to improve the performance using classical considerations. These 
methods, and hopefully our approach as well, compensate for the loss of intuition that usually 
occurs when using pure optimal design techniques. 

The organization of the report is as follows. In Section 2, we describe the completed parts of 
the research, including the helicopter model and the handling qualities requirements. In Section 
3, we discuss in detail the two major problems that we faced, and their solutions. These problems 
are emphasized for two reasons. First, this information may be important for further works. 
Moreover, the proposed research outline described in [4] and [5] has been changed in order to solve 
these problems and others, such that we are now several months behind the original schedule. A 
short description of the present status is given in Section 4. A program and schedule for future 
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work are given in section 5, the final section. 

Practically for the given specifications, one of the design goals is to ensure the internal stability 
of the closed-loop system. That is, the closed-loop system matrix A in a state-space representation 
of the closed-loop system is required to have all eigenvalues in the open left of the complex plane. 
This problem can be formulated as a non-smooth optimization by minimizing the largest real part 
of the eigenvalues of A . Typically, the process of minimization will tend to converge to a solution 
where many eigenvalues of A have the same real part. Moreover, it may not be possible to express 
the non-smooth objective function as the maximum of finitely many smooth functions and this 
difficulty presents a need in theoretical research. Progress has been made along this line by our 
sub-contractor at Georgia Tech. Among other things, a smooth reformulation of the problem and 
an algorithm are proposed such that a quadratic rate of convergence is often achieved. A report 
summarizing the results is attached in Appendix C. This algorithm will be eventually implemented 
in CONSOL-OPTCAD to simultaneously address other design specifications. 

2 Completed parts 

2.1 A model for the UH-60 in hover 

The UH-60A (Black Hawk) helicopter in hover, was chosen as a benchmark example for this 
research. The comprehensive rotorcraft aerodynamic model (UM-GenHel) [6] requires much too 
much computer time for one simulation run to be useful for the purposes of this research. Thus we 
wrote a simplified model which can be used by the optimization package (CONSOL-OPTCAD). 
The simplified model represents the helicopter linear dynamics and aerodynamics, as well as the 
most important system nonlinearity (i.e., actuator saturation). The specific representation of the 
helicopter in hover is, in fact, a part of the design process setup. Therefor this model has been 
modified several times in order to obtain the system performance measures as well as to satisfy 
some computational limitations. 

The final configuration is shown in Figure 2.1, including: 

(i) P(s) - Linearized UM-GenHel 11 states model (9 states for the 6 DOF fuselage dynamics and 
2 states representing the rotor flapping motion). 

(ii) Actuator Model - The swashplate actuators are modeled using standard saturation functions 
for displacement and rate limits and a 1** order approximation for the actuator dynamics (see 
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Figure 2.2). Note that the displacement saturation is implemented on the actuator command as 
is standard [7]. 
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Figure 2.1 Block diagram of the simplified rotorcraft model 
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Figure 2.2 Block diagram of the actuator model 
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Figure 2.3 Detailed model of the feedforward portion of the rotorcraft controller 

(iii) Delay - Pure delay which represents the overall system delay (computation delays, A/D and 
D/A delays, unmodeled dynamics etc.). 

(iv) HyF(s) - are ADOCS controllers. H is a constant gain, output feedback matrix, /"(s) is a 
decentralized feedforward dynamic controller obtained from the model following concept [8], based 
on a command model dynamics Af(s) and on a 1 #< or 2 nd order approximation for the helicopter 
dynamics P a (s), see Figure 2.3. Note that if F 0 (s) = -P(s) then the resultant closed-loop transfer 
function is M(s). 

The overall closed-loop system is controlled by the pilot using four input commands 6 = 
(^, tf c ) representing respectively the lateral, longitudinal, main rotor collective, and tail 
rotor collective cockpit commands. The output is y = (u, which stands for the 

longitudinal velocity (u), lateral velocity (v), vertical velocity (in), roll rate (p), pitch rate (9), yaw 
rate (r), roll angle (<£), pitch angle (0), and yaw angle (V>). The disturbance vector d = (d*, d*, d^) 
allows us to simulate the helicopter angular rate changes caused by wind gusts. 

In order to calculate efficiently all the desired performance measures, this model has two 
different versions. There is a continuous-time linear model, for small amplitude performance, 
where the actuator model A(s) is a simple I st order model (no saturation) and the delay D(s) is a 
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2 nd order Pade' approximation. There is also a discrete-time nonlinear model, for large amplitude 
performance, where all the continuous time parts (i.e., P(s) and F(s)) are replaced by suitable 
ZOH equivalents. The nonlinear simulation is implemented by solving the difference equations 
recursively. For more details see MATLAB M-files, init.m and simu.m in Appendix A. 

An important component of the initial task of this project was to verify the helicopter model. 
The linearized reduced order UM-GenHel model (11 states) was compared with the full UM- 
GenHel model (33 states) [6], and with the simplified 6 DOF model (9 states) [8]. In addition, the 
linear and the nonlinear closed-loop models were checked and compared using the “real” ADOCS 
parameters (i.e., feedback gains, and feedforward parameters). For further discussion of model 
verification see Paragraph 3.2. 

2.2 Design specifications and their translation to mathematical functions 

The following five specifications were identified as those that are essential to meet the ADS-33C 
(Aeronautical Design Standard) [9]. The helicopter is considered to hover ([9], Paragraph 3.3 - 
hover and low speed). The flight control system has an ACAH (Attitude Command Attitude 
Hold) response type. The standard to be met is performance LEVEL 1 for all MTEs (Mission 
Task Elements) excluding target acquisition and tracking for UCE (Usable Cue Environment) 2 
and 3 (for detailed definitions see [9]). 

These five requirements are naturally divided into two groups. The first two requirements 
relate to small-amplitude response and can be checked using linear models. The remainder are 
related to moderate-amplitude response and must be checked by nonlinear simulation. 

In some cases the original requirements were changed slightly in order to achieve mathemat- 
ical properties (i.e., smoothness) that simplify the optimization process. However, because of 
saturation the smoothness property is not always guaranteed. 

2.2.1 Small-Amplitude Changes, Short-Term Response to Control Inputs 

This modern frequency based criterion (band width /phase-delay) replaces the traditional specifica- 
tions which used limits based on time-delay and rise-time. The bandwidth/phase-delay criterion 
emphasizes features directly related to closure of the piloted loop, and it is a better metric than 
rise-time for the prediction of handling qualities for small-amplitude precision tracking tasks. It 
is clear that pilots are also sensitive to the shape of the phase curve at frequencies beyond the 
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phase [deg] phase [deg] 


bandwidth frequency. This shape is characterized by the phase-delay parameter [7]. Thus, the 
level regions for the short-term response requirement are defined in the bandwidth/phase-delay 
plane as shown in Figure 2.4. Actually, for small phase-delay systems this is a “pure bandwidth” 
criterion. Above a certain value of phase-delay (about 0.2 sec) it becomes a tradeoff between 
bandwidth and phase-delay (i.e., the pilot can tolerate higher phase-delay but then, in order to 
achieve the same performance level, he needs higher bandwidth). 



frequency [rad/sec] frequency [rad/sec] 



frequency [rad/sec] , Bandwidth [rad/sec] 


Figure 2.4 Spec 1 : Bandwidth vs. Phase-delay 

(ire) Bode phase plot, bandwidth and phase-delay of three examples, (d) Their D meassures in the Spec 1 plane. 
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Bandwidth and phase-delay are measured from a frequency response (Bode) plot of angular 
attitude response to cockpit controller input. Bandwidth and phase- delay, as defined in the 
specification [9], Paragraphs 3.3.2.1 (Pitch, Roll) 3.3.5.1 (Yaw), are referenced to the helicopter 
with all augmentation loops closed. Thus, they can not be simply calculated from the closed-loop 
design parameters. Usually, two bandwidth frequencies are measured: the frequency for 6 db gain 
margin (u> BW , ain ), and the frequency for 45° of phase margin {vBW phag e ). For ACAH response 
types VBW phase taken, since the nature of ACAH is such that the pilot does not have to close the 
attitude loop for stabilization purposes, so the gain margin problems are less apparent. Phase- 
delay is defined so that it represents all of the contribution to phase less than —180°, and is based 
on the observation that the phase curve tends to be linear. 

In order to meet the LEVEL 1 (also true for other levels) requirement a two dimensional geo- 
metrical measure D (normalized quadratic distance) is defined, such that minimizing this measure 
implies better performance. The computation of this measure is done in three steps. First, the 
graphical level curve is converted into an analytic smooth function using a polynomial curve fitting 
algorithm. In order to achieve a univalent function the level curve is represented as u; B w = /(r^), 
i.e, the bandwidth frequency is a function of the phase-delay, which is a nondecreasing C°° func- 
tion, Figure 2.4 d (the dotted curve). This calculation is done only once in the initialization 
routine (for more details see init.m in Appendix A). The other two steps are executed in each iter- 
ation. First, the bandwidth and phase-delay are computed, based on the above definitions, using 
an efficient search algorithm. Second, the D measure is calculated as the minimum normalized 
quadratic distance from the current (u^BWy^d) point to the level curve. Moreover, measure D is 
calculated only for points which are not in the LEVEL 1 region (i.e., they are to the left of the 
dotted curve in Figure 2.4 d), and it is set to zero for any (uBW,Td) point within the LEVEL 1 
region. This definition makes D(ubw, Td) differentiable even on the boundary of the LEVEL 1 
set (i.e., u B \v = /(^)), and gives an identical weight for any point in the desired set (LEVEL 1 
region). The computation of D uses the fact that the level curve is a nondecreasing function, so it 
eliminates the need to evaluate the function u B w — f{jd) a.t each l>bw — f( T d) (for more details 
see dLbw-pd.m in Appendix A). 
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2.2.2 Small-Amplitude Changes, Mid-Term Response to Control Inputs 

This requirement is, in general, the complementary part of the short-term response requirements 
of Paragraph 2.1.1. The short-term criterion emphasizes features related to the high-frequency 
modes, whereas the mid-term influences mainly the low-frequency modes. Although, because of 
the particular definitions of both criteria, it is unavoidable that this requirement overlaps the 
short-term one. The mid-term requirement is specified for two different pilot operation modes: 
“Fully Attended Operations” - where all of the helicopter tasks can be accomplished with full 
pilot attention to aircraft control (e.g., other crew members handle the non-control tasks), and 
“Divided Attention Operations” - where the pilot should be able to relinquish control of the 
helicopter for short periods of time without encountering significant excursions. Consideration of 
divided attention (as done in this research) ensures that, at least practically, any flight control 
system which meets this requirement is BIBO stable. In fact a helicopter which meets the fully 
attended requirement can have unstable mid-term response (as with many present-day helicopters). 
For more information see [9], Paragraphs 3.3.2.2 (Pitch, Roll) 3.3.S.2 (Yaw). 

The parameter that is used in this requirement is the damping ratio £. This parameter, which is 
well defined for second-order systems, can be interpreted in several different ways for higher order 
systems. Unfortunately, most of these interpretations lead to numerical algorithms which are either 
significant time consumers, or are not smooth enough, or both. For example, computing £ as the 
logarithmic decrement of the two first peaks of the system step response is very time consuming. 
On the other hand, using eigenvalues, in order to find £ as the ratio between the imaginary and 
the real part of the 2 n ^order approximated model is, in general, not a smooth calculation. The 
computer time problem becomes critical because this algorithm has to be executed in a multi- 
iteration optimization process. The smoothness problem is sometimes even more critical since it 
may cause failure of the optimization process. 

Usually, ( defined only for stable systems. Thus practically, this requirement also implies sta- 
bility. Indeed, the 2 nd order approximation is based on two parameters a and b ^g(s) = 
which completely define the system stability (i.e., a > 0, b > 0). Calculating ( from a and b 
(i.e., £ = implies that b must be positive (for numerical reasons, since £ is a real number). 
Therefor, although it is not explicitly required, the optimization algorithm is forced to search for 
a solution only for b > 0 (to avoid numerical failure this requirement is translated to a “hard 
constraint”, see [1]). In the future, in order to ensure stability, we will replace this constraint by 


9 



theta [deg] 



c. Heading (yaw rale) channel 



d. Spec 2 plane 



-z*Wn 


Figure 2.5 Spec 2 : Damping ratio 

Responses for longitudinal (a), lateral (b), and heading (c) channels, (d) Their ( me assures in the Spec 2 plane. 


the algorithm described in Appendix C. Under this constraint, and by expanding the definition 
of C to hold also for negative values, the system stability becomes a function of ( only (i.e., stable 
for C>0, and unstable for ( < 0). 

During this work we have examined two ways for calculating (, both based on model reduction 
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peak_rate/step_hight [1/sec] 


techniques. First, we tried to use algebraic methods such as minimal realization using both a 
pole/zero cancellation method and a state elimination method. Unfortunately, these methods can 
not be applied for arbitrary models (i.e., the minima] size of the reduced model depends on the size 
of the full one). Thus usually they do not guarantee reduction to 2 nd order. In the current approach 
the 2 nd order approximation is obtained by using 2 nd order system identification. In this technique 
a second-order AR (auto- regressive) model is estimated using the LS (least-square) method. This 
approach needs only a few time points (about 20) and the LS algorithm ensures smoothness (for 
more details see zeia.m in Appendix A). Typical closed-loop, real and approximated, responses 
are shown in Figure 2.5. 

2.2.3 Moderate- Amplitude Attitude Changes (Attitude Quickness) 

FYequency domain based criteria (e.g., bandwidth) fail in the presence of strong nonlinearities 
such as saturation. Therefore, in order to check the helicopter performance during large maneu- 
vers we have to define alternative criteria. Let a denote the nominal value of the angular position 
for step response (i.e., o = step height), and let denote the peak value of the angular 
velocity (p, q,r). Then, for linear systems the ratio s p */a is directly related to the system band- 
width [7]. Using this criterion instead of one of the classical linear measures of bandwidth allows the 



Figure 2.6 Spec 3 : Quickness ratio 

Level 1 performance is achieved if the quickness ratio is above the curve shown. 
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specification to make the required bandwidth a decreasing function of the size of the maneuver, as 
shown in Figure 2.6. For nonlinear systems, especially with saturation nonlinearities, this conces- 
sion is essential (i.e, it is unreasonable to require the same “bandwidth” for all input heights). In 
these cases it is not correct to interpret this ratio as a bandwidth, but it is better interpreted as a 
measure of agility (i.e., ”quickness” ratio). For more information see [9], Paragraphs 3.3.3 (Pitch, 
Roll) 3.3.6 (Yaw). 

The calculation of the quickness ratio s p k /a is very simple. The max(-) operator is used over the 
sampled rate vector. It is then normalized by the input height. Note that using the max(-) operator 
in this case does not destroy smoothness because we try to maximize the quickness ratio and the 
combination max /max is smooth. However, smoothness may not hold due to the saturation. This 
test theoretically has to be checked for an infinite number of step inputs. Practically we check it 
for each channel for only three different inputs (for more details see simu.m in Appendix A). 

2.2.4 Interaxis Coupling 

Helicopters dynamics are naturally interaxis coupled. The following requirements relate to the 
two common helicopter interaxis couplings. These are the cross-coupling between pitch and roll 
(i.e., pitch/roll and roll/pitch), and yaw (rate) due to collective. These couplings are caused 
mostly by aerodynamic rotor moments and by the unsymmetric tail moments. Both couplings 
would adversely affect the pilot’s ability to complete some high maneuvers tasks. One of the most 
important design objectives is to minimize these couplings (e.g., using closed-loop techniques such 
as: high-gain, LQR, etc.). 

The decoupling requirement is mostly significant for high-maneuver responses. Thus, it is 
checked only for large amplitude step responses (The coupling measures are relative measures. 
Hence, in the case of linear systems small input amplitudes can be used as well). The natural 
measure is used for the cross coupling between pitch and roll (i.e., the ratio of peak off-axis 
response to desired response, 8 p kl4>dt * > and <t> p k!0dt» )• To avoid use of nondifferen table functions 
such as max(-) and afrs(-), suitable upper and lower limits are defined to bound the response over 
the relevant time interval. 

For the yaw- to- collective decoupling requirement a much more involved criterion is required. 
This criterion is designed to meet the pilot’s needs during aggressive tasks. This criterion is a 
measure of not only the magnitude, but also the shape of the yaw rate response to a step collective 
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stick input. The shape of the yaw rate is taken into account by measuring the peak yaw rate rl 
and value of yaw rate after 3 seconds. In case there is no peak in the time interval [0,3] the yaw 
rate at 1 second r(l) is taken instead ([9], Paragraph 3.3.9). This may cause the measure to be 
not smooth with respect to the design parameters. It is dear that this can happen only if, during 
the optimization process, the damping ratio ( of passes the value (i.e., ((n) < and 
<(« + 1) > ^ or v * ce versa > where n is the iteration number). To avoid this singularity we take 
= r 0) regardless of the peak time (i.e., even if the peak occurs within the time interval [0,3]). 
In addition to the above requirement, it is also required that the maximum oscillation amplitude, 
following a step collective change is below a certain limit. IVom accumulated experience this limit, 
has to have units [7], (i.e., it is not relative as in the pitch-roll case). All the information required 
for the coupling specs is taken from the largest input simulation of the quickness test (for more 
details see sirnu.m in Appendix A). 

2.2.5 Wind Gust Response 

The model-following concept allows the designer to increase the I/O bandwidth (theoretically 
unlimited) by changing the parameters of the desired model M(s ) only. Actually, this is the ma- 
jor advantage of using a model-following control, because in most cases the designer can not achieve 



Figure 2.7 Spec 5 : Wind-gust model and wave form 
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the desired bandwidth due to closed-loop stability limitations (e.g., limited state feedback gains). 
However, this approach has some disadvantages. One of them is that the disturbance rejection 
requirement (system “stiffness”) is no longer directly related to the overall system bandwidth (i.e., 
we can have a high-bandwidth, low- stiffness system). Therefore, the original wind-gust rejection 
criterion ([9], Paragraph 3.2.6) is not suitable for the ADOCS configuration. A new requirement is 
defined [10], by using an approximated gust model, where the gust peak value is chosen to fit the 
disturbance input point (the wind-gust is applied directly to the helicopter state equations). The 
wind-gust input wave form is shown in Figure 2.7, the detailed definition of the new requirement 
is presented in [10]. 


Spec 5 plane 



Figure 2.8 Spec 5 : Wind-gust rejection - required envelope 

The wind gust response criterion is a two parameter criterion. Following precisely the require- 
ments for the ACAH response type [9] leads to a pulse- input settling-time criterion. The first 
parameter, settling-time f«(o) is defined by the condition that the absolute value of the response 
fy(0l ^ a V t > fj(a), where a is 10% of the response peak value. The second parameter is the 
peak value itself. Practically, settling-time is obtained using a discrete-time search algorithm, 
which may cause t,(a) to be not smooth with respect to the design parameters. Moreover, in order 
to obtain the peak value we have to use the moi(-) operator, and since the optimization algorithm 
tries to minimize this peak it may cause some smoothness difficulties. To avoid this possibility 
it is highly recommended that one uses functional constraints. Therefore, the ACAH response 
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type requirement was slightly changed such that the angular position following a wind-gust input 
should lie between the two curves of Figure 2.8. 

Remark: This test has to be simulated for 40 seconds. Thus it becomes the biggest time consumer 
of the simulation. One might think that in order to save simulation time, this test can be replaced 
by a mixed time domain (for the peak) and frequency domain (for *,(a)) test. However, it turns 
out that then the problem constraints can not be functional constraints (smoothness !) and they 
must be more conservative in order to guarantee the same performance level. 

2.3 Optimization set-up 

The computerized optimization set-up includes 3 groups of files: 

(i) Optimization definitions (see Appendix B, and [1]): 

• ADOCS - CONSOL-OPTCAD Problem Description File (PDF), contains definitions for the 
design parameters, their initial values, and their limits (if there are any). 

• SPEC#.$ - Spec files where # = 1 } 2,3,4>5; stands for the specific spec (i.e., 1 - bandwidth vs. 
phase-delay, 2 - damping ratio, etc.), and $ = pit (pitch ), rol (roll), yaw ; stands for the specific 
controlled channel. These files include the problem constraints (types, values, and weights). These 
values can be changed at each iteration. 

In the first stage of the optimization (“convert”), all the above files are included to one 
CONSOL-OPTCAD executable file. 

(ii) Model and performance (see Appendix A): 

• INIT.M - Initialization file, defines the open-loop parameters (helicopter linearized dynamics, 
actuator saturations, overall time delay, wind-gust wave form, etc.). This file is executed once, at 
the begining of each optimization run. The initial parameters can be changed (if required) after 
each iteration. 

• SIMU.M- Contains closed-loop parameters including the design parameters, and the calculation 
of the performances measures. This file is executed at each iteration. 

• **M ~ Series of function files, containing algorithms which have to be executed several times at 
each iteration (e.g., for the discrete recursive computation). 

(iii) Monitoring tools: 

This group of files allows the user to monitor the optimization process at any time, even if the 
process is running in the background. These tools were developed for two reasons. 
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Figure 2.9 Performances map (initial) 




First, the present version of CONSOL-OPTCAD has limited graphical outputs (a new version 
including a strong graphical interface is currently being developed). Moreover, it is desirable to 
display all the performance measures at their specific (not standard) planes, preferably in one 
screen (page), as shown in Figure 2.9. This feature gives the designer the ability to see the present 
performance map, at any time, just by pushing one button. 

3 Major Problems 

During this research we faced many problems (mathematical problems, numerical problems, tech- 
nical difficulties, computer bugs, etc.). Two of these problems had substantial influence on the 
research outline and rate, in particular the first one (3.1), for which we spent several months to 
find the present (probably not the optimal) solution. 

3.1 Simulation running time 

In retrospect, we can say that this problem is a direct result of two poor choices made at the begin- 
ning of this research: choosing MATLAB as the main simulation tool and the linear continuous- 
time model as a starting test case. However, in this early stage, these choices seemed to be the 
right ones. At the beginning, we used a simplified test case, in order to develop and to check 
some of the performance measures. Then, MATLAB was a natural choice, mainly because of 
its powerful linear algebra tools. Although, MATLAB is an interpreter (i.e., its code can not be 
compiled to produce an executable file) the running time per one CONSOL-OPTCAD iteration 
was reasonable (several minutes). However, when we started to complicate the model (i.e., adding 
the saturation nonlinearity) the running time increased dramatically (more than 10 hours per one 
iteration). The main source for this unacceptable simulation duration was the nonlinear simula- 
tion. This was initially implemented by using numerical integration using the Runge-Kutta 4 th or 
5 th order variable step size algorithm (see MATLAB M-file, ode45.m). This method is relatively 
slow, in particular in MATLAB, where it is implemented very inefficiently. 

At the first step, we tried to speed up the algorithm by converting the M-file to a C MEX-file 
(compiled subroutine) and by improving the original algorithm interface (see MATLAB MEX-file, 
ode45m.c), but the minimum simulation time that we achieved was about 10 hours per optimization 
iteration. The second step was to replace the continuous time model with a discrete time one, and 
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then the numerical integration was replaced by an efficient recursive calculation. Moreover, this 
change saved 8 states used for the time delay Pade # approximation (in discrete time, time delay 
has a very simple implementation). Finally, after some more simplifications (e.g M reduce rotor 
dynamics to a 2 nd order flapping model, replace wind-gust model with its a-priori calculated wave 
form, etc.), we ended with 30 - 40 minutes per iteration. During the research, we have considered 
switching to an independent simulation code written in C or FORTRAN, but it now seems that 
that is no longer worthwhile, because all our working environment is based on MATLAB. 

This problem affected the research by consuming the time needed for solution searching. More- 
over, it implies that in the next research steps, in order to get results in reasonable time, we must 
use a smart computer run policy as discussed in Section 4. 

3.2 A bug in UM-GenHel code 

At the beginning of the model verification (August 1992), we realized that the given linearized 
model gave wrong results. First, we verified that closing the loop with the real ADOCS parameters 
[11] produced an unstable closed-loop system. After several months (December 1992) of testing and 
comparing the linearized model and the UM-GenHel/ADOCS interfaces without any improvement 
the nonlinear UM-GenHel simulation was compared with previous simulation results to find that 
there was a bug in the UM-GenHel code. It is not clear why or when this bug appeared ? 
However, it is clear that the wrong model was used at least in one research [12]. Finally, a new 
version was installed and, after some adjustments were made (February 1993), we obtained the 
present linearized model. 

4 Present status 

The main activity now is the tradeoff runs. We have already finished the first set of nominal 
runs starting with an arbitrary initial guess, (e.g., ADOCS parameters [11]), and with a fixed 
arbitrary set of constraint weights for a fixed number of iterations. However, since we chose all the 
optimization parameters arbitrarily, and we did not apply any tradeoff strategy, the final solution 
is an arbitrary one, and hence it is not the “optimal” one. The final and the initial performance 
maps for this case are shown in Figure 2.9 and Figure 4.1 respectively. 

Unfortunately, as we already mentioned, this design problem is not convex and maynot be 
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Figure 4.1 Performances map (final) 





Spec No. 

Measure 

Pitch 

Roll 

Yaw 

1. 

distance 

(1) 

(2) 

(3) 

2. 


[4] 

[6] 

[8] 


C 

(5) 

(7) 

(9) 


small 

(10) 

(13) 

(16) 

3. 

medium 

(11) 

(14) 

(17) 


large 

(12) 

(15) 

(18) 






4. 

coupling 

< 1,2 > 

< 3,4 > 

1 






5. 

peak 

< 7,8> 

< 11, 12 > 

< 15, 16 > 


t,(a) 

< 9,10 > 

< 13,14 > 

< 17, 18 > 


Table 4.1 Problem constraints 


smooth everywhere. Therefore, in order to help the optimization process to converge to an accept- 
able solution we must adjust the design degrees of freedom: initial guess, and constraint weights. 
At the first step (currently developed) we are going to use a tradeoff strategy only for the con- 
straints weights. This may give us the ability to control the local solution (the global solution 
strongly depends on the initial guess). For this purpose we have a set of design DOFs, which are 
the “soft constraints” shown in Table 4.1, where each ($) or < $ > stands for one design DOF 
(total 26). Note that (#) (or (#, #)) is a CONSOL-OPTCAD “soft constraint” (or a pair of “soft 
constraints”), [#] is a “hard constraint”, which has no weight(i.e., it is not a design DOF), and 
< # > (°r <#,#>) is a “functional (soft) constraint” (or pair of “functional constraints”). For 
more details see spec files in Appendix B, and (1]). 

The “good values” ( GV ), which are exactly the spec requirements, remain fixed during the 
design process and the constraints weights ( CW ) are directly related to the “bad values” (W). 
Practically, in the CONSOL-OPTCAD iteration we can change the “bad values” using the 
factors Pij such that: 

BVij = (1 - 1 /fiij)GV + 1 IfaBVij-n * = 1,2,. . .,26, 
which implies: 
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CWij = fiijCWij-i 

This approach using a normalized multiplicative weighting factor, easily allows us to interfere 
at each CONSOL-OPTCAD iteration, and “push” the algorithm toward the right direction. 

5 Proposed work 

We have completed approximately half of the tasks we originally scheduled during the first year. 
The only real change to our protocol is the use of somewhat more simplified model in the design 
process than was originally planned. We do not believe this will introduce any changes in the rest 
of the project. Our primary goal remains the development of CONSOL-OPTCAD as a tool for 
the design of control systems for advanced rotorcraft. It is very important to understand that we 
are developing a tool, not an algorithm. A human designer plays the crucial role in designing the 
controller. As with many sophisticated and complex tool it is not possible to simply hand our 
customization of CONSOL-OPTCAD to a designer and say, “Go design a controller”. There is 
a great deal to learn about how to set-up the design problem and about how to force CONSOL- 
OPTCAD to arrive at a good design. This is the major focus of our proposed research for the 
coming year. 

As last year, there are two major issues, controller structure and controller parameters. CONSOL- 
OPTCAD is fundamentally a parameter optimizer. Thus, it must be given the structure of the 
controller. There are several ways to do this. We will first use the ADOCS structure and op- 
timize its parameters, as we had originally planned to do last year. We view this as a form of 
inverse problem. That is, as a first step we will try to force CONSOL-OPTCAD to converge to 
the ADOCS design. We know this is not a perfect design but it is a very useful way to learn 
how the parameters with which the designer influences the design (the good and bad values of the 
mathematical versions of the specs primarily) influence both the evolution and the result of the 
CONSOL-OPTCAD computations. 

Once we have done this we would like to experiment with the choice of controller structures. 
We will first use our intuition and conversations with various rotorcraft experts to produce some 
other controller structure. We will then use CONSOL-OPTCAD to choose good parameters for 
those controllers , study their sensitivity to changes in the controller parameters, and evaluate 
their robustness. If time permits we would then also like to try an LQR/LTR design using the 
methodology outlined outlined in the introduction. 
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We propose to follow the schedule given below. 

1. Design studies using the ADOCS structure (completed by August 1993) 

1.1. Study performance specification tradeoffs. 

1.2. Study controller sensitivity to specification. 

1.3. Study controller robustness. 

1.4. Solve the inverse problem - i.e., force CONSOL-OPTCAD to converge to the ADOCS 
parameter values. 

2. Design studies using other prespecified structures (completed by Jan. 1994) 

2.1 - 2.4. Repeat the studies done for the ADOCS structure. 

3. Design studies using LQR/LTR (completed by March 1994) 

3.1 - 3.4. Repeat the studies done for the previous structures. 

Remark: The third task will only be done if time permits. We believe it would be very useful 
to demonstrate the possibility of combining CONSOL-OPTCAD with different optimal control 
methods but it is not necessary for the success of the proposed project. 
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Appendix B - Some CONSOL-OPTCAD files 
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functional_conatraint *p guat p u* soft 
for t from 0 to 10-dtg by dtg 
{ import dtg; 


Appendix C - 

Stabilization via Smooth Optimization: 
Continuous-Time Case 


Chin-Yee Lin Michael K.H. Fan 

School of Electrical Engineering 
Georgia Institute of Technology, Atlanta, GA 30332 


Abstract 

Let A(x) be an n x n real matrix whose elements are analytic functions of x € R m . In this 
paper we consider the problem on minimizing the largest real part of the eigenvalues of A(x). This 
problem is in general non-differentiable and non-convex. It is shown that an existing algorithm 
proposed by Fan [2] for solving a class of smooth constrained problems can be applied here. This 
algorithm enjoys the property that, if started close enough to a local minimizer £*, then it will 
converge to x* quadratically. To ensure numerical stability and efficiency, our derivations employ 
real Schur decomposition instead of eigenvalue decomposition of A(x). We also develop result on 
the analyticity and uniqueness of the real Schur decomposition of A(x). This result may be of 
interest on its own right. 
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C-0 Notation 

/ Identity matrix of appropriate size 

x T Transpose of vector x 

||*|| Euclidean norm of vector x 

A t Transpose of matrix A 

A > 0 Matrix A is positive definite (A > 0, A < 0, and A < 0 are defined similarly) 
tr(i4) Trace of matrix A 

j4t Moore- Penrose generalized inverse of matrix A 

% real part 

C.l Introduction 

Let A{x) be an n x n real matrix whose elements are analytic functions of z € R m . The eigenvalues 
of j4(x) are denoted by A x (x), . . A„(x), arranged in decreasing order by their real parts, i.e., 

SAj (z ) > • • •> KA„(x) 

(those eigenvalues with same real parts but different imaginary parts may be ordered arbitrarily). 
Define 

4>{x) = max RA;(x) 

So <j>(x) is the largest real part of eigenvalues of A{x). In this paper, we study the optimization 
problem 

<j > * := min <f>(x) (1-1) 

relR m 

which arises in stabilization of dynamic systems. For instance, consider the nonlinear system 

z(t) = A(x)z(t) + f(x,z(t))> z(0) = z 0 (l- 2 ) 

where /(x, z(t)) denotes the second or higher order terms of the state z(f), and z is the decision 
variable to be chosen in order to meet some design specifications. The system (1.2) is said to be 
asymptotically stable at the origin if there exists 6 > 0 such that ||zo|| < $ implies ||z(f)|| — » 0 as 
1 — ». 0. It is well-known that (1.2) is asymptotically stable at the origin if A{x) is a stable matrix, 
i.e., 4>(x) < 0. Thus, if 4>* < 0, then we could choose some z to guarantee the stability of (1.2). 

The function 4>{x) is in general non-differentiable. Typically, the process of minimization tends 
to make at least the real parts of eigenvalues coalesce at the solution. Moreover, near the solution 
when it is non-differentiable, it may not be possible to represent 4>(x) as the maximum of finitely 
many differentiable functions; this type of non-differentiability usually makes the problem difficult 
to solve. 

The function <j>(x) in general has local minimizers which are not global. Therefore, obtaining 
a global minimizer will typically require massive computation. If A(x ) is symmetric and depends 
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upon x affinely, then 4>{x) is convex in x. In this case, among many other algorithms, a quadraticaily 
convergent local algorithm has been proposed in [2] for solving (1.1). 

In this paper, we shall only be concerned with finding a local solution of (1.1). It is shown that 
the algorithm in [2] can be extended here such that, if started close enough to a local minimizer x *, 
then the proposed algorithm will converge to x * quadraticaily. The extension involves two key steps. 
First, we show that it is possible to formulate (1.1) near a local solution as a smooth constrained 
problem which satisfies the hypotheses given in [2]. For this purpose, we use some results in analytic 
perturbation of eigenvalues, e.g., the Puiseux series representation of repeated eigenvalues. Second, 
we show how to derive first and second derivatives for some functions (which involve eigenvalues 
of A(x)) to be used in the algorithm. To ensure numerical stability and efficiency, our derivation 
is based on real Schur decomposition of A{x) rather than its eigenvalue decomposition. We also 
developed result on the analyticity and uniqueness of real Schur decomposition of A(x). This result 
may be of interest on its own right. 

The paper is organized as follows. In Section C.2, we give the assumptions that will be used 
throughout the paper. In Section C.3, we show the smooth formulation of (1.1). In Section C.4, 
we briefly describe the algorithm proposed in [2] and discuss how it can be extended to solve 
(1.1). Section C.5 is devoted to the derivatives of real Schur decomposition of ,4(x) as well as the 
functions used in the proposed algorithm. Finally, numerical examples are given in Section C.6 to 
demonstrate the proposed algorithm. 


C.2 Assumptions 

In this section, we give the assumptions to be used throughout the paper. 


• A(x) is analytic for all x 6 H m . 


• There exists a local solution x* of (1.1). 


• Suppose that 


»Ai(x*) = . . . = £A 9 (x*) > RA, +1 (**) 


for some q . Then, the subset of eigenvalues (A^x*), . . . , A g (x*)} is non-defective, i.e., there 
are q linearly independent eigenvectors associated with those eigenvalues. Furthermore, let 
Z{ denote the component matrix (see, e.g., [5] for the definition of the component matrix) 
associated with A,(x*). Then, for t = 1, . . . , q and j = 1, . . . , m, the set of eigenvalues of the 


matrix 


Zi 


dA(x') 

dxj 


Zi 


is non-defective. 
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C.3 A smooth formulation 


Let x* denote a local solution of (1.1). Given x € !R m , the multiplicity of <f>{x) is said to be q 
if RAj(x) = RA,(x) > 3?A,+i(x). Suppose that the multiplicity q * of ^(x*) has been correctly 
identified. In this section, we give a smooth formulation of (1.1) near x*. Specifically, we define 
certain functions /i(x) and / 2 (x), both map from JR”* to It, that satisfy the following properties: 
(») x* is a local solution the optimization problem 

{/iOO = h(z) = 0} (3.1) 

(**') /i( x ) “d / 2 (x) are twice continuously differentible at x*, and (Hi) /i(x) and / 2 (x) satisfy the 
assumptions required by the algorithm in [2] ((Hi) will be checked after a recall of the algorithm in 
[2]). In next section, we show how to solve (1.1) by solving (3.1) using the algorithm in [2]. 

Let us first proceed with an example to illustrate the main ideas in defining such /i(x) and 
/ 2 (x). Suppose that q* is 5, i.e., 

£Ax(x*) = • • • = RA 5 (x*) > »A 6 (x*) 

Assume that the complex conjugate pairs (A^x*), A 2 (x*)} and (Aj(x*), A 2 (x*)} are nonreal and 
identical, and As^*) is real. Since q* is known by assumption, x* will still be a local solution of 
(1.1) even if the following constraint is imposed 

»Ax(x) = •■• = S?A 5 (x) (3.2) 

Also, it is easy to check that, if restricted in the constraint set defined by (3.2), then x* is a local 
minimizer of </>(x) if and only if it is a local minimizer of £?=i RA,(x). Therefore, if we define 

^( x ) = f 5Z ka *( x ) = ^5 Z a *(x) (3.3) 

1=1 ° 1=1 

and 

4 

M *) = - ^ i+1 (x)) 2 (3.4) 

1=1 

Then, we have the assertion that x* is also a local solution of (3.1). 

Now let us turn to the question on differentiability of /i(x) and / 2 (x). The function fi(x) 
defined in (3.3) is analytic at x* (see below). However, the function / 2 (x) defined in (3.4) is 
not even differentiable. Roughly speaking, this is because that the expression in (3.4) is not 
“symmetric” with respect to RAj and RA 3 or with respect to »A 2 and RA< (since A x (x*) = A 3 (x*) 
and A 2 (x*) = A 4 (x*) by assumption). To make it “symmetric”, we may add a few redundant terms, 
i.e., re-define / 2 (x) by 

/*(*) = EE (RAj(x) - RAj(x)) 2 (3.5) 

i=i i=i+i 
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It is easy to see that (3.4) is zero if and only if (3.5) is. However, the difference it makes is that 
/ 2 (x) defined by (3.5) is twice continuously differentiable at x *. This will be shown below. 

Of course, there are many other ways to define analytic /i(x) or / 2 (x), besides simply multi- 
plying them by constants. For example, we may define f 2 (x) to be the square of the right hand 
side of (3.5). Or, on the right hand side of (3.5), we may use other positive even powers other than 
squares. However, among all f\(x) and /i(x) that satisfy the required properties, it seems that 
(3.3) together with (3.5) is the most convenient pair as it requires the least information for the 
eigenvalue locations. This point shall become clear later. In fact, so far, only (3.3) together (3.5) 
satisfies all the require properties. 

We axe now ready to give the main result of this section. 

Theorem 3.1. Let x * denote a local solution of (1.1). Suppose that the multiplicity q* of <f>(x*) 
has been correctly identified. Define 

1 q * 

= (3.6) 

9 1=1 

and 

AM = E £ (»A,M - SA,(*)) J (3.7) 

1 = 1 j = *+l 

Then, x* is a local solution of (3.1). Also, at x*, /i(x) is analytic and / 2 (x) is twice continuously 
differentiable. □ 

C.4 Fan’s algorithm 

In this section, we give a brief description of Fan’s algorithm. It will be used in solving (2.1). More 
details about the algorithm can be found in [2]. 

Let x* be a local solution of (1.1). Suppose that the multiplicity q * of <^(x*) has been correctly 
identified. Recall that the functions /i(x) and /^(x) defined in Section C.2 are twice continuously 
differentiable at x*. Therefore, if both x — x* and h are sufficiently small, then /,(x + /i), i = 1,2, 
is equal to its Taylor series expansion about x, i.e., 

/,(x + h) = /<(*) + 9 ?(x)h + ±h T Hi(x)h + 0(||h|| 3 ), * = 1,2 

where ffi(x) and H t (x ) are the gradient and Hessian of /,(x), respectively. Expressions of g,(i) and 
(•*■)> i = 1,2, will be given in Section C.5, where the following properties they possess can be 
easily verified. (*) g 2 (x) = 0 if / 2 (x) = 0. (») H 2 (x) = H 2l {x) + H 22 (x)- t H 21 (x) and H 22 (x) are 
symmetric; H 2 x (x) > 0; ff 22 (x) = 0 if / 2 (x) = 0. 
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Let the columns of iV(x) (resp. i?(x )) form an orthonormal basis for the null (resp. range) 
space of Hi i(x). Since Hn(x) is symmetric, we have the identity 

N(x)N t (x) + R{x)R t (x) = I 

for every x in a neighborhood of x*. 

It is shown that under certain regularity conditions, a local solution x* satisfies 

f JV r (x*)</i(x*) = 0 

l -R T (z*)?2(z*) = 0 

which is a system of m nonlinear equations with m unknowns [2, 8]. One step of Fan’s algorithm 
consists in linearization of the nonlinear equations (4.1) at the current point x* and chooses the 
new point x* +1 as the solution of the obtained system of linear equations. It can be shown that the 
matrices N(x) and R{x) can be chosen as smooth functions of x. It follows by standard arguments 
that if started closed enough to a local minimizer x*, then it will converge to x* quadratically. 

Given h € R m , define D(x,h) to be the directional derivative of Hi i(x) in the direction h. 
Also, let D( •,•) : R"* x R m -♦ JR" 1 *™ be such that D(x,h)<f> = D(x,<t>)h for all x,h,<f> € R m (this 
is possible since D(x, h)<f> is bilinear in h and <f>). We now summarize Fan’s algorithm as follows. 

Algorithm 4.1. Let <f>i(x) = — /f|i(x)p 2 ( a: ). Define h(x ) = <f>i(x) if N(x) is a null matrix. 
Otherwise, let <p x (x) = ^(x^x), Q{x) = H x {x) + L>(x,^(x)), and define 

h(x) = <h{x) - N(x) (N T (x)Q(x)N(x))~ 1 N T (x)(Q(x)<h(x) + *(x)) 

Then, x new is defined by x new = x + h(x). □ 

C.5 Derivatives of eigenvalues 

Since A(x) is analytic, it is well-known that if A(a) has all distinct eigenvalues, then the eigenvalues 
of A(x), as functions of x, may be ordered in such a way that they are analytic at x = a. In this 
section, we assume that A(a) has all distinct eigenvalues and derive its derivatives. The result 
will be used in the next section for the derivatives of /i(x) and fi(x) (which are required in the 
proposed algorithm for solving (2.1)). 

To obtain the derivatives, we may proceed with an eigenvalue decomposition of A(a), i.e., to 
have a nonsingular matrix T such that the similarity transformation T~ 1 A(a)T is diagonal. The 
advantages of being able to use a diagonal matrix instead of a full one are obvious. However, 
we choose not to do so. This is because that the matrix T may be ill-conditioned in the sense 
that its inverse may be large. In this case, it is well-known that the similarity transformation 
will magnify significantly any error occurred in the matrix A(a) and thus may cause numerical 
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instability. Moreover, we have to carry out the computation in complex arithmetics even though 
A(a) is real. 

Instead, we use a real Schur decomposition of A(x), i.e., to have an orthogonal matrix U 
such that the orthogonal similarity transformation S = U T A(a)U is in a certain block upper 
triangular form, where the diagonal elements contain information about the eigenvalues (see below). 
The matrix S will be called a real Schur form of A(a). Orthogonal similarity transformation is 
particularly desirable since neither U nor its inverse U T can be large. In fact, it is easy to check 
that no element of U or its inverse can be greater than one in absolute value. Also, real arithmetics 
can be used throughout the computation. 

However, nothing comes for free. Unlike in the case of eigenvalue decomposition, even with 
the order of eigenvalues specified, the matrix 5 in real Schur decomposition is non-unique (in 
fact, highly non-unique). In order to have the derivatives of eigenvalues using the matrices from 
a real Schur decomposition, we need first address the following questions, (t) Given a real Schur 
decomposition of A(a) 

A(a) = U a S a V J (5.1) 

is there a choice of real Schur decompositions of A(x) 

A(x) = U(x)S(x)U(x) T (5.2) 

such that (5.2) coincides with (5.1) at x = a, and both U(x) and S(x) are analytic at a ? (ii) If 
the answer to (t) is yes, then is the choice unique ? We will show that the answer to (t) is indeed 
affirmative. Also, with a mild assumption, the answer to (ii) is also affirmative. In fact, with the 
same assumption, the matrix S(x) in (5.2) will be shown to have identical strictly lower triangular 
part for all x in some neighborhood of a. This property enables us to compute the derivatives of 
the eigenvalues using real Schur decompositions. 

We continue with a review of real Schur decomposition. Although a proof of real Schur decom- 
position can be easily found in many books in matrix computation, we include one here (which is 
borrowed from [3]) since the proof itself is useful in our subsequent developments. 

C.5.1 Real Schur decomposition 

Fact 5.1. (see, e.g., [3]) Let A € R nxn . Then there is a real orthogonal matrix U € R nxn such 
that U T AU = 5 is block triangular with lxl and 2x2 blocks on its diagonal. The lxl blocks 
contain the real eigenvalues of A, and the eigenvalues of the 2x2 blocks are the complex eigenvalues 
of A . Furthermore, the diagonal blocks may be arranged in any prescribed order. 

Proof. It is algorithmic and proceeds by a sequence of reduction of similar type. Let A],...,A n 
be the eigenvalues of A, arranged in any prescribed order (with complex pairs ordered together). 
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Suppose first that Aj is a real with a normalized eigenvector Uj. Choose zj, . . . , z n _x 6 It" such that 
form a basis of R". Apply the Gram-Schmidt orthonormalization procedure to 
this basis to produce an orthonormal basis {vx, uq,. . .,u; n _x} of R". Define U\ = [tq ti>x ... u> n -i]- 
Then, Uj AU\ has the form 


U?AU X = 


’ Ax 

♦ 

0 

. 


(5.3) 


where Aj = Ai and Ax = [u>x ••• u> n _x] r A[u>i ••• tn„_i] € R( n_1 ) x ( n_1 ) with eigenvalues 
A 2 , . . . , A n . If Ai = a + ip is nonreal with nonzero eigenvector vx = /i + iV. It can be shown 
that fi and v are linearly independent. Choose zx,...,z n _2 € R" such that zj, . . z„_ 2 } 

form a basis of R". Apply the Gram-Schmidt orthonormalization procedure to this basis to pro- 
duce an orthonormal basis {/!,/>, u>x, . . ., u>„_ 2 } of R". Define U\ = [/i 0 w\ ... tw„_ 2 ]. Then, 
Ui AUi also has the form (5.3) where this time Aj is a 2 x 2 real matrix whose eigenvalues are 
Ai and A 2 (A 2 = Ai), and Ai = [u>j ••• w n - 2 ] T A[w^ ••• tw„_ 2 ] G R( n-2 ) x ( n-2 ) with eigenvalues 
A 3 , ...,A n . Now do it over again for A\ and determine an orthogonal matrix U 2 € R(" -1 ) x ( n_1 ) 
such that 

U2 A\ U2 = 


a 2 

* 

0 

A 2 


(5.4) 


Define 


V 2 = 


’ I 

0 

_ 0 

u 2 . 


Then the matrices a nd 17i V 2 are orthogonal, and {U \V 2 ) T A{U 1 V 2 ) has the form 


(U x V 2 ) t A(U 1 V 2 ) = 


■ Ax * 
0 A a 

* 

0 

A 2 _ 


(5.5) 


Continue this reduction to produce orthogonal matrices Ui € R( n- ') x ( n- ') and VJ € R" x ", t = 
1, . . .,s, where s denotes the number of eigenvalues of A with non-negative imaginary parts. Then, 
the matrix U = U\V 2 V$ • * • V t is orthogonal and U T AU yields the desired form. □ 

It is noted that the orthogonal matrix U and the block triangular matrix S in the real Schur 
decomposition are by far non-unique; we have not only the freedom in choosing the order of the 
eigenvalues, but also, in each reduction step, the freedom in choosing an eigenvector and vectors 

Zi s. 


In the sequel, we will denote by S n the set ofnxn matrices which are real Schur forms of 
themselves, and satisfy the following properties: (*') the diagonal blocks of any element in S n are 
arranged in descending order according to the real parts of its eigenvalues (those blocks with same 
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real parts of eigenvalues may be ordered arbitrarily), and ( ii ) every 2x2 diagonal block, if any, of 
S € S n has different diagonal elements, i.e., if 

si s 2 
s 3 s 4 

is a diagonal block of 5, then Si ^ s 4 . 

By the virtue of Fact 5.1, every matrix has a real Schur form which satisfies the property (*). 
This property will be shown to greatly simplify the expression for the derivatives of fi(x). On the 
other hand, not every matrix has a real Schur form which satisfies the property (it). For example, 
let 


Then, A has at most two real Schur forms (itself and its transpose). In fact, it can be easily checked 
that every 2x2 real matrix is of the form (5.6) if and only if it is a normal matrix (a real matrix 
A is normal if AA? = A? A). Furthermore, if none of 2 x 2 diagonal blocks of a real Schur form of 
A is normal, then it can be shown that A has a real Schur form that satisfies the property (it). 

C.5.2 Derivatives of real Schur form 

We first address the analyticity of the real Schur form. 

Theorem 5.1. Suppose that ./4(a) has all distinct eigenvalues. Then the following statements 
hold, (i) There exist analytic functions U(x) and S(x), such that U(x)S(x)U T (x) constitutes a 
real Schur decomposition of A{x) in an open neighborhood of a. (ii) Let U a S a Uj be a real Schur 
decomposition of A{a) and assume S a 6 S n . Then the functions U{x) and S(x) in (i) can be chosen 
in such a way that U(a) = U a , S(a) = S„, and the strictly lower triangular part of S(x) is identical 
in an open neighborhood of a. (iii) The choice for U(x ) and 5(x) in (ii) is unique. 

Proof. Let U\ be given as in the proof of Fact 5.1 for the real Schur decomposition of A(a). Also 
let and zj t . . ., z n _j be the corresponding choice in the first step of the reduction procedure. We 
show first that, for all x in some open neighborhood of a, there exists an analytic function l/,(x) such 
that U\(d) = Ui and the matrix {x)A{x)U\{x) has the form (5.3) or (5.4). Since all eigenvalues 
of A(a) are distinct, it is well-known that the eigenvector rj(x), with t>i(a) = tq, can be chosen to 
be analytic at a [4]. Now, fix xj,.. .,z„_ 1 and consider the matrix tfj(x) in a neighborhood of a. 
It is easily checked that, in a neighborhood of a, the vectors t>i(x),xi,. . ,,x n _j still form a basis 
for E n . Also, the Gram-Schmidt orthonormalization procedure, as a function of vj, is analytic at 
«i(a). Consequently, U\{x) is also analytic at a (since it is a composition of two analytic functions). 
Following similar arguments, it can be shown that U (x) and S(x) = U^(x)A(x)U (x) may be chosen 
to have the desired property in the first claim. 
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To show the second claim, without loss of generality, we may assume that A(x) is a 2 x 2 matrix 
with nonreal eigenvalues (so it is already in a real Schur form). Let 


La 3 (x) a 4 (x) 

In view of the assumption, we have aj(a) ^ a 4 (a). Let 


*( x ) = \J( a i( x ) - 04 {x )) 2 + (a 2 (x) + a 3 (x)) 2 
Then, it is tedious but straightforward to check that 


0 =- f s i n -i 2fl 3(g) ~ C3(s) + az(j) _ -1 o 4 (x)-ai(x) \ 

2 V <(*) <(*) ) 


and the orthogonal matrix 


are all analytic at a, and 


U(x) = 


cos 8 X — sin 9 X 
sin 8 X cos 8 t 


U t (x)A(x)U(x) = S(x) = f * * 

a 3 (a) * 

The proof for the third claim will given at the end of this section. □ 

Lemma 5.1. Let S £ S n and B 6 !R nXn . Assume that S has all distinct eigenvalues. Then, among 
all skew-symmetric matrices in R" xn , there exists a unique P such that P T S + SP + B is upper 
triangular. 

Proof. We show the claim by a recursive construction. Let e\ , . . . , e n denote the standard basis 
vectors in R n . For * = 1, . . ., n - 1, define £, = [e t+1 • • • e n ] € R nx(n -\ p, = EjPt { € R n_ ‘, and 
5, = EjSEi € K.( n -‘)x ("-«). Thus, any skew-symmetric matrix P is uniquely determined by the 
vectors pi, . . .,p n -i- Suppose that pj, . . ., p k -\ have been obtained for some k. We then show how 
to proceed. Let S = {s,j}. First, let us assume s kt k-i = sjt+i,* = 0 (so, s kk is an eigenvalue of 5). 
Since P T S + SP + B is upper triangular, we have 

El (p T S -1- SP + B) e k = 0 

which is equivalent to 

*-i 

~~ y! P — (sfckl ~ Sk)pk 4" Be % = 0 

»=1 

Thus, pfc can be solved and is equal to 

Pk = (Skkl - S k )-' SikElP T ei + ElBek'j (5.7) 
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Notice that the right hand side of (5.8) does not depend upon .,p n _i. Also, since S has 
distinct eigenvalues by assumption, the matrix s kk I - S k in (5.7) is invertible. 

Second, we assume s*_ i t * = 0 and s k+ itk ^ 0. Again, since P T S + 5P+ B is upper triangular, 
we have 

f EJ (P T S + S P + Bi) e k =0 
\ El„(pTS + SP + B)e M =0 

which is equivalent to 

“ jC »ikElP T ei - ( s kk I - S k )p k - s k+ i <k ElP T e k +i + ElBe k = 0 

k-i 

~ pTe ' ~ ( 5 *+i,*+i ^ “ Sk+i)pk+i ~ Sk,k+iEl +1 P T e k + El +1 Be k+i = 0 

- £ Si k ElP T ei + El Be k 

k- i ’ =1 „ (5.8) 

— s i,k+iE k+i P T €i + El +l Be k +i _ 

where 

$ = 


c * = Skk - Sk+i,k+i , and ® denotes the Kronecker sum. Again, it is easy to check that the right hand 
side of (5.8) does not depend upon p k + 2 , • ••,?„. Also, in view of the definition of S n (which implies 
c i = &k k — 3*+i,fc+i ^ 0), the assumption that S has distinct eigenvalues, and the eigenvalues of 
Kronecker sum of two matrices, we conclude that $ is invertible. Finally, the uniqueness of P is 
obvious from the construction. □ 

In the sequel, given S G S n with distinct eigenvalues and B G R nxn , we will denote by 

P = A(S,B) (5.9) 

the unique skew-symmetric matrix P that constructed by the recursive procedure given in the proof 

of Lemma 5.1. Using (5.9), we show below how to derive the first and second order derivatives of 
the real Schur form. 

Theorem 5.2. Suppose that A(z) is analytic at o 6 R m and A(a ) has all distinct eigenvalues. Let 
U(a)S{a)U T (a) be a real Schur decomposition of A(a) and assume S(a) G S n . Let U(x) and S(x ) be 
some analytic functions given in Theorem 5.1 («). For i, j = 1 , . . . , m, define = U T (a)^^U(a) 
and Pi = A(5(a), £,). Then, for » = 1, . . ., m, it holds that 

d5Ya) T 

-Q^ = PlS(a) + S(a)P t + Bi 
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Furthermore, for i,j = 1 , . . . , m, define 

B _ pT ^Sja) dSja) T Bp T jT (a ^ 2 A(a) 

Bx3 ~ p > d Xi + d Xi P > + Pi B} + B > p ' + U ( a) te&; u{a) 

and P{j = A(S(a), Bij). Then, for i, j = 1, . . . , m, it holds that 

d^S(a) 7- 

diidxj = Pij + S ( a ^ P '} + B{ i 


Proof. For * = let = U(x)Pi(x) for some P(x) € R nx ". It is clear that P,(x) is also 

analytic. Differentiating the identity l/^(x)U(x) = J with respect to x,- yields 

/f(x) + P,(x) = 0 

i.e., Pi(x) is skew-symmetric for all x. Also, differentiating the identity S(x) = U T (x)A(x)U(x) 
with respect to x t at a yields 

dS(a) T 

= Pf (a)5(a) + 5(a)P,(a) + B t 

By the property (*i) concerning 5(x) in Theorem 5.1, we have that is an upper triangular 
matrix. In view of Lemma 5.1, Pi(a) is unique and equal to P{(a) = A(S(a),Bi). This proves 
the first claim. The second claim follows similar arguments. Since p(x ) is skew-symmetric for all 
x, we have that ^ ^ is also skew-symmetric for all x. Then it is straightforward to check that 
differentiating the identity 5(x) = U T {x)A(x)U(x ) twice at a yields 

d^S(a) T 

= i£(a)S(a) + S(«)i>y + 

where Pij(a) = ^ ^ . Again, in view of Lemma 5.1, Pij(a) is unique and equal to Pij(a) = 

A(S(a), Bij). This completes the proof. □ 

We now turn to the uniqueness question about the real Schur form 5(x) of A(x) given in 
Theorem 5.1 (in). ^Prom the proof of Theorem 5.2, it is seen that the first and second order 
derivatives of 5(x) at a are uniquely defined by 5(a), U(a) and the derivative at A{x) at a. In 
fact, this result holds for higher order derivatives of 5(x) as well. Therefore, given two choices of 
analytic functions (t/(x),5(x)} and {f/(x),5(x)} in Theorem 5.1 («), it holds that 5(a) = 5(a) 
and all derivatives of 5(x) and 5(x) at a coincide with each other. Thus, we must have 5(x) = 5(x) 
for all x. By the same argument, we can conclude also that U(x) = U(x) for all x. This proves the 
third claim of Theorem 5.1. 
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C.5.3 Derivatives of eigenvalues 


Using the results on the derivatives of real Schur form, we are now ready to give the derivatives of 
the eigenvalues. 


Theorem 5.3. Suppose that A(a) has all distinct eigenvalues. Let U(a)S(a)U T (a) be a real Schur 
decomposition of A(a) and assume 5(a) € S n . Also, let l/(x) and 5(x ) be the unique analytic 
function given in Theorem 5.1 (it) and (tit). Then, the first and second order derivatives of the 
eigenvalues of A(x) at a can be computed as follows. If A(a) is a real eigenvalue of A(a), i.e., A(a) 
is a diagonal element of 5(a), say, s(n), then 


3A(a) = dsja ) fl 2 A (a) fl 2 s(a) 

dxi dij dx idij diidxj 


for i,j = 1,. . .,m 


Now, suppose that A(a) is a nonreal eigenvalue of A(a) with positive imaginary part, i.e., A(a) is 
an eigenvalue of a 2 x 2 diagonal block of 5(a), say, 


si(a) s 2 (a) 

s 3 (a) 54(a) 


Define 

m _ dskja ) (fc) _ <Psk(a) _ I 1 

dxi ’ 7 *> dxidx } ' V A ( a )-M“) 

and yi = (si(a) — 54 (a)) ( 7 ,-^ - + 2 53 (a) 7 f 3 \ Then, for i,j = 1,. . .,m, we have 


(5.10) 


d Xja) 
d n 

0 2 A (q) 
dx^Xj 


= \ (7i X) + 7 i' 4) + Vi r) 

= \ (l r S ) + ~ V, Vj r 3 + 2 53(a) t + ( 7 J 1} 

+ r (5 X (a) - 54(a)) (jfp - 7W) 



Proof. The result is obvious when A(a) is real. When A(a) is nonreal, one can first find its analytic 
expression in terms of the elements of the matrix in (5.10), and then apply the chain rule for its 
derivatives. □ 


C.5.4 Derivatives of fi(x) and f 2 (x) 

In thi6 section, we give the expressions for various quantities that needed in Algorithm 4.1. They 
can be easily obtained by using the preceding results and the chain rule. Here, given a vector t, we 
denote by (<), its t-th element, and given a matrix T, we denote by (Tfj its ij-th element. Also, to 

g— 1 g g 

simplify the notation, we denote £ Yl by £. 

k=i i=k+i k,i 
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9i(x) = 

#i(*) = 

9 *=i 

92{x) = £(RA*(*)-RAK*))(«*-«,) 
H 2 (x) = H 2i (x) + H 22 ( x) 


Z?(x, h) = £ ((2* - - t,) r + (/* - t,)h T (7* - T,)) 

k,l ' 

k t l ' ' 

# 21(20 = J 2 (h - ti)(tk - ti) T 
k,l 


H 22 {x) 


(tk)i 


m a 


E(RA fc (*)-»A,(*))(r*-r,) 

k,i 

^dXkix) 

dxi 

s «! Mi) 

dxidij 


C.6 A numerical example 

The proposed algorithm has been implemented in Matlab [6]. In this section, we present a simple 
example to illustrate its convergence property. The example is to minimize over R 2 the largest real 
part of the eigenvalues of A{x), where 4(x) is defined by 


A(x) — Aq + xi,4i + x 2 A 2 + xiX2i4a + x^A 4 + x\As 
and the matrices A{, i = 0, . . . , 5 are given as follows. 


0.2 

1.0 

- 0.2 

0.0 

- 6.6 ' 


4.1 

7.2 

5.7 

- 3.5 

- 6.4 ' 

0.7 

- 3.1 

- 1.5 

5.3 

- 1.5 


- 2.3 

7.2 

5.1 

- 1.0 

- 4.2 

- 1.4 

2.8 

4.6 

0.9 

- 6.3 

Ai = 

0.8 

1.9 

- 6.4 

- 4.6 

- 0.3 

3.6 

5.8 

- 3.9 

3.0 

- 8.0 


2.8 

7.4 

- 0.7 

0.6 

- 3.2 

0.3 

9.1 

- 3.2 

2.0 

1.7 


- 8.4 

7.1 

- 1.3 

- 1.9 

1.4 
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4.4 2.3 —1.1 6.6 7.8 -9.4 -3.5 1.5 —1.5 -4.8 

-4.6 9.2 -6.7 -1.8 5.9 -1.6 -0.9 0.4 0.8 0.8 

A 2 = -3.1 0.2 11. -8.0 -7.0 A 3 = -2.1 0.5 -3.8 -3.6 5.8 

-3.6 -2.8 -4.2 13. 2.7 7.7 2.3 3.7 -6.6 -2.7 

3.1 -2.6 4.3 0.9 10. -4.9 0.3 2.5 2.0 -1.5 

’ -3.8 6.2 -0.3 -5.8 -7.1 1 [ 15. -0.0 -4.2 -3.8 0.6 

-7.5 0.2 3.9 -4.7 3.1 5.8 12. 2.8 -3.6 4.5 

A 4 = -0.4 -0.7 -3.7 -0.1 -1.8 As = 5.3 -1.1 9.4 5.0 4.6 

-2.4 -4.2 2.4 6.9 -0.3 -9.7 4.4 4.2 13. 0.1 

-0.2 -5.5 0.3 5.7 8.4 4.8 -3.7 -8.9 6.7 15. 

A local solution is 

, _ [ 0.14867145915551 

X ~ [ -0.38655872292658 

which is obtained by trial and error. The corresponding eigenvalues of A(x) at z* are 

‘ Aj(x*) 1 [ 3.96924962356182 + t 7.73645446194478 

A 2 (z*) 3.96924962356182 - t 7.73645446194478 

A 3 (z*) = 3.96924962356181 

A„(x*) -1.78038425406443 

A 5 (z*) J [ -10.02592118139874 

Thus, the multiplicity of <f>(x m ) is 3. Some level curves of <f>(x) around x* are given in Figure 6.1 (x* 
is located at the center of the figure). Also, in Figure 6.2, solid lines are the level curves of /i(x) 
and the dotted line is the feasible set {x : / 2 (x) = 0}. 

We set g = 3 and choose the starting point x = 0 in running Algorithm 4.1. In order to 
ensure convergence in a larger neighborhood around x*, we perform line search along h(x*) at each 
iteration. We choose Ik to be the smallest integer in {0, 1,2, — } such that defined by 

x t+1 = x k + (0.5/* h(x k ) 

strictly decreases the objective function, i.e., <^>(x*'*' 1 ) < <j>(x k ). The test result is summarized in 
Table 6.1. In Table 6.1, the second column is the largest real part of the eigenvalues of A(x fc ); the 
third and fourth columns together is the optimality condition for a local minimizer (both values 
are zero at the minimizer); and the last column is the distance between x k and the local minimizer 
x*. The integer /* used line search is one for the first iteration and zero for the rest. The result 
shows that the rate of convergence is indeed quadratic. 

Acknowledgements. The work of this research was supported in part by NASA- Ames. 
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Iter k 

4>{x k ) 

M xk ) 

ll^ T (**)Si(**)|| 

II** “ *11 

0 

6.960 

2.6e+01 

2.4e+01 

4.1e— 01 

1 

5.945 

5.9e+01 

9.9e— 01 

2.6e-01 

2 

5.754 

7.9e+01 

4.8e+00 

1.3e— 01 

3 

4.515 

2.1e+00 

5.2e+00 

1.2e— 01 

4 

4.000 

2.2e-02 

3.8e— 02 

3.0e-03 

5 

3.96958 

2.5e— 06 

l.le-03 

2.7e— 05 

6 

3.969249645 

l.le-14 

*— * 

to 

1 

o 

00 

2.3e-09 

7 

3.96924962356182 

7.9e-31 

1.2e— 14 

1.0e-15 


Table 6.1 
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